Proof of concept: Remove the requirement to check out the gutenberg repository and run the full build script #11019
Conversation
With this new approach, a pre-built zip will be downloaded instead.
This is no longer necessary because a Git repository is no longer used.
This switches to downloading a prebuilt zip file uploaded to the GitHub Container Registry for a given commit instead. This eliminates the need to run any Gutenberg build scripts within `wordpress-develop`.
A REF is typically a human-readable path to a branch, whereas a SHA is a hash representing an individual commit. Since the latter is what's being used here, this aims to avoid confusion.
There is one remaining issue to figure out: how to map the downloaded files. I've removed the chunk of code responsible for mapping those files to their respective locations.
It seems that the icons are not currently included in the Gutenberg plugin build. @mcsf I see you introduced this in 5b7a3ad. Could you share a bit more context as to why these are not yet included in the Gutenberg plugin? I don't see them in the latest plugin version, but I may be missing them.
dmsnell left a comment:
this is a step up, @desrosj, and should avoid a lot of the delay when checking out the repo. 🙇‍♂️
does the zip contain the proper built versions of the artifacts or does it contain source copies? that is, are there any build steps on the Gutenberg side that occur before that ZIP is available?
it would definitely be helpful to note this in the docs, at least in `download-gutenberg.js`, because I think it's really not clear whether we are downloading the Gutenberg source or some build output. if output, I would love to know on the wordpress-develop side what our expectations are of that artifact and why `.layers[0].digest` is special.
Perhaps there is a command we could include which comparably would generate the right files were we given the Gutenberg source.
is it right that this still leaves the icon disparity in place?
```js
	fs.readFileSync( packageJsonPath, 'utf8' )
);
sha = packageJson.gutenberg?.sha;
ghcrRepo = packageJson.gutenberg?.ghcrRepo;
```
are we short on bytes? the `gh` part is obvious to me as a reader, but I cannot figure out what `cr` means, and since this only appears to be documented inside this file, I'm not sure it would be clear from the package.json file, especially since it looks like a standard GitHub repo name.

perhaps something like `githubRepo` would suffice? it doesn't seem like the package.json value has anything specific to do with the GitHub Container Registry.
I am new to this as well. For what it's worth, I found that GHCR stands for the GitHub Container Registry.
I've added a note within the inline docblock to explain this at the first occurrence of `ghcrRepo`. Does that address your concerns here @dmsnell?
I also changed the name of the package to gutenberg-wp-develop-build, making the full reference WordPress/gutenberg/gutenberg-wp-develop-build. It's more specific to the purpose, and hopefully won't lead to any confusion.
Within the GHCR, packages are published to a specific repository. While they are listed on a separate screen, you need to use the Org/Repository/package path to reference them.
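As a hypothetical sketch of how the pinned values and the `Org/Repository/package` path come together, assuming the `gutenberg.sha` / `gutenberg.ghcrRepo` fields described in this thread and the standard OCI distribution URL shape that ghcr.io implements (`ghcrManifestUrl` is an illustrative name, not code from this PR):

```javascript
// Hypothetical sketch: build the GHCR manifest URL for a pinned package.
// The `gutenberg` block in package.json and the exact field names are
// assumptions taken from this discussion, not verified against the PR.
function ghcrManifestUrl( pkg ) {
	const sha = pkg.gutenberg?.sha;
	// e.g. 'WordPress/gutenberg/gutenberg-wp-develop-build'
	const repo = pkg.gutenberg?.ghcrRepo;
	if ( ! sha || ! repo ) {
		throw new Error( 'package.json is missing gutenberg.sha or gutenberg.ghcrRepo' );
	}
	// OCI registries require lowercase repository names in URLs.
	return `https://ghcr.io/v2/${ repo.toLowerCase() }/manifests/${ sha }`;
}
```

The `toLowerCase()` matters because OCI repository names are lowercase even when the GitHub org name is not.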
```js
const installedHash = fs.readFileSync( hashFilePath, 'utf8' ).trim();
if ( installedHash !== sha ) {
	throw new Error(
		`SHA mismatch: expected ${ sha } but found ${ installedHash }. Run \`npm run grunt gutenberg-download -- --force\` to download the correct version.`
```
at first I was going to ask about making this check before downloading, but I see it exists after downloading.
this is probably better because it informs someone that their files are out of date without eagerly downloading a new copy and replacing the old ones.
I think I also want to create a file in the root of the repository to store a hash of the entire gutenberg directory immediately after unzipping. Then the user can also be made aware of instances where the files in the gutenberg directory were modified, which may be unexpected.
this can backfire if we want to hash the entire directory. that can be an I/O-intensive task and either requires polling or reliable file watching.
with the solution already here: notify when expired, it gives agency to the user to download or ignore
```js
const packageJsonPath = path.join( rootDir, 'package.json' );
```
```js
/**
 * Execute a command, streaming stdio directly so progress is visible.
```
Is this really streaming? It seems the stdout variable is populated with streaming, but the promise only resolves with the captured stdout when the child process is closed?
Updated, how does that look?
Oof, I'm so glad you pinged me! Fix at WordPress/gutenberg#75866
The Gutenberg repository has been updated to include the icon assets in the plugin build. See github.com/WordPress/gutenberg/pull/75866.
- Add missing hard stop at the end of inline comment.
- Make use of `fetch()` instead of `exec( 'curl' )`.

Co-authored-by: Weston Ruter <westonruter@gmail.com>
Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of: for more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.
```js
	throw new Error( `Failed to download blob: ${ response.status } ${ response.statusText }` );
}
const buffer = await response.arrayBuffer();
fs.writeFileSync( zipPath, Buffer.from( buffer ) );
```
at this point it’s probably possible to remove all of the `*Sync` versions and stream the output directly into the file.
not essential, but it does feel like we’re possibly going out of our way to do more work to avoid a simpler default.
```js
await fs.writeFile( zipPath, response.body );
```
I'm not deeply knowledgeable here, but Claude told me that this wouldn't work as you've written it, "since fs.promises.writeFile doesn't accept a ReadableStream — stream.pipeline is the correct API for this." Does this seem right to you?
```js
} else {
	await exec( 'git', [ 'checkout', 'FETCH_HEAD' ], {
		cwd: gutenbergDir,
	} );
```
this can fail if there are uncommitted changes. what’s the primary goal here? is it to update the code with recent changes?
should it preserve local changes that can be preserved?
could we simply abort and mention if there are uncommitted changes? let them figure it out rather than mess with their config if they don’t expect this to happen.
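The abort-on-dirty idea could be as small as inspecting `git status --porcelain` output before the checkout. A sketch with a hypothetical helper (the parsing is mine, not this PR's code):

```javascript
// `git status --porcelain` prints one line per changed file, so any
// non-blank line means there are uncommitted changes and the checkout
// should stop with a message rather than clobbering them.
function hasUncommittedChanges( porcelainOutput ) {
	return porcelainOutput
		.split( '\n' )
		.some( ( line ) => line.trim() !== '' );
}
```

The caller would run something like `execSync( 'git status --porcelain', { cwd: gutenbergDir } )` and abort with an explanatory error when this returns true.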
tools/gutenberg/checkout.js (outdated)
```js
const dirHashFile = path.join( rootDir, '.gutenberg-dir-hash' );
if ( fs.existsSync( dirHashFile ) ) {
	fs.rmSync( dirHashFile );
}
```
this check introduces a race condition that doesn’t need to exist.

```js
fs.rmSync( dirHashFile, { force: true } );
```

like most FS calls, performing `if` checks before doing them (rather than catching errors afterward) opens up a range of bugs and exploits.
tools/gutenberg/utils.js (outdated)
```js
 * Directories inside gutenberg/ to exclude when hashing.
 * These are build artefacts or dependency caches, not source files.
 */
const DIR_HASH_EXCLUDE = new Set( [ 'build', 'node_modules', '.git' ] );
```
what about build-module, build-style, and build-types, etc…
would it make more sense to read from the .gitignore and apply those rules?
My intention here was to notify the user of any potentially unexpected discrepancy in the gutenberg directory vs. when it was created fresh from the pinned hash.
Essentially, any time there's a change to any source or built file, it could be an indication that there will be unexpected results when building. Looking at this again now with a fresh brain:
- `build` should likely not be excluded (editing these files directly is incorrect and should be flagged).
- `phpunit`, `tests`, etc. should also be excluded.
- `docs`, `platform-docs`, etc.
Maintaining this list will be a pain, though.
ultimately we only truly care about the files which get copied into Core, right? would we benefit from reusing that single file mapping from the copy step?
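If the `.gitignore` route were explored, a rough sketch might look like the following. Real `.gitignore` semantics (globs, negation, anchoring) are much richer than this, so a library such as the npm `ignore` package would be the safer choice in practice; this hypothetical helper only collects plain directory names, which is all the `DIR_HASH_EXCLUDE` set currently holds:

```javascript
// Naive .gitignore reader: keeps plain entries, drops comments, blanks,
// and negations, and trims leading/trailing slashes so the result matches
// the directory-name form used by DIR_HASH_EXCLUDE.
function excludesFromGitignore( gitignoreText ) {
	return new Set(
		gitignoreText
			.split( '\n' )
			.map( ( line ) => line.trim() )
			.filter( ( line ) => line && ! line.startsWith( '#' ) && ! line.startsWith( '!' ) )
			.map( ( line ) => line.replace( /^\//, '' ).replace( /\/$/, '' ) )
	);
}
```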
```js
checkRepoHash();

// Warn if the source has changed since the last build.
checkDirHash();
```
if this is the existing style, then whatever is fine, but these functions without return values are surprising and opaque. the need for the explanatory comments comes from the fact that their behavior is hidden inside them, rather than from being explicit in how we’re handling the responses.
note: checkRepoHash() appears to just do nothing if the directory is entirely missing, but I would have expected from the comment here that some warning would have been raised.
```js
}
function collectFiles( dir, excludeDirs = new Set() ) {
	const files = [];
	for ( const entry of fs.readdirSync( dir, { withFileTypes: true } ).sort( ( a, b ) => a.name.localeCompare( b.name ) ) ) {
```
what is the point of sorting the paths?
This was a Claude suggestion.
The sorting ensures the hash is deterministic — the same directory contents always produce the same digest regardless of the order the filesystem returns entries.
I have not validated this yet, but it does make sense to me.
```js
 * This stores the hash of the Gutenberg directory in a `.gutenberg-hash` file
 * to track when changes have been made to those files locally.
 * Symlinks are skipped because they can point to directories and would throw
 * EISDIR when read as regular files.
```
if we didn’t want this warning here we could resolve symlinks with `fs.readlinkSync()` and then treat them like normal paths.
there are obviously valid reasons not to follow symbolic links, but confusion about whether we are reading the link itself vs. the thing it points to is not one I’m used to seeing.
This has better native support in Node.
My initial rough measurements were showing that the hashing takes around 150-200ms with the downloaded pre-built plugin files, and ~2.5 seconds with the full repository checkout.
```js
	response.body,
	zlib.createGunzip(),
	Writable.toWeb( tar.stdin ),
);
```
ah darn, sorry if I leaned too hard into this and we still have to call out to the shell.
```js
	}
} );
child.on( 'error', reject );
} );
```
there is also `spawnSync()` if you don’t want all the setup and error handling code.

```js
function runScript( scriptPath, args = [] ) {
	const response = spawnSync( process.execPath, [ scriptPath, ...args ], {
		cwd: rootDir,
		stdio: 'inherit',
	} );
	if ( 0 === response.status ) {
		return;
	}
	throw response.error ?? new Error( `script exited with status ${ response.status }` );
}
```
This is a work in progress.
This PR attempts to remove the requirement to check out the WordPress/gutenberg Git repository locally in order to run the `gutenberg` repository build script as a part of the `wordpress-develop` one. Instead, a copy of the built plugin is downloaded from the GitHub Container Registry for the respective pinned hash value (see WordPress/gutenberg#75844 for that part of this approach). This is now the default behavior.

The ability to check out and build from the `gutenberg` repository has been maintained. This is helpful in case a contributor wants to test changes to that repository (perhaps the `wp-build` package) locally before pushing/creating a PR. This mode can be activated by setting `GUTENBERG_LOCAL_REPO=true` locally in the `.env` file.

Other changes in this PR:

- `ref` has been renamed to `sha` to avoid confusion. A ref is typically a human-readable branch path; a commit hash or SHA value is the full-length, unique hash for a commit.
- `grunt` tasks have been renamed to follow a `gutenberg:command` pattern instead of `gutenberg-command`.
- `gutenberg` has been removed from all file names within `tools/gutenberg` since it's redundant.

Trac ticket: Core-64393.
Use of AI Tools
I used Claude Code pretty heavily throughout the work on this pull request.
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.